2.
Multivariate Behav Res ; 1-21, 2024 Apr 09.
Article in English | MEDLINE | ID: mdl-38594939

ABSTRACT

Item omissions in large-scale assessments may occur for various reasons, ranging from disengagement to attempting an item, failing to solve it, and giving up. Current response-time-based classification approaches allow researchers to implement different treatments of item omissions that presumably stem from different mechanisms. These approaches, however, are limited in that they require a clear-cut decision on the underlying missingness mechanism and do not take the uncertainty in classification into account. We present a response-time-based mixture modeling approach that overcomes this limitation. The approach (a) facilitates disentangling item omissions stemming from disengagement from those arising from solution behavior, (b) considers the uncertainty in omission classification, (c) allows omission mechanisms to vary at the item-by-examinee level, (d) supports investigating person and item characteristics associated with different types of omission behavior, and (e) gives researchers flexibility in deciding how to handle different types of omissions. The approach exhibits good parameter recovery under realistic research conditions. We illustrate the approach on data from the Programme for the International Assessment of Adult Competencies 2012 and compare it against previous classification approaches for item omissions.
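To fix ideas, a generic two-component form consistent with this abstract (the article's exact component models and parameterization may differ) is

\[ f\!\left(t_{ij}\mid\text{omission}\right)=\pi_{ij}\,\mathrm{LN}\!\left(t_{ij};\,\mu_j^{(d)},\sigma_{(d)}^{2}\right)+\left(1-\pi_{ij}\right)\mathrm{LN}\!\left(t_{ij};\,\mu_j^{(s)},\sigma_{(s)}^{2}\right), \]

where $t_{ij}$ is the time examinee $i$ spent on omitted item $j$, $\pi_{ij}$ is the probability that the omission stems from disengagement, and the two lognormal components capture the typically short times preceding disengaged omissions versus the longer times preceding omissions after solution behavior. The posterior probability of disengagement given $t_{ij}$ then carries the classification uncertainty instead of forcing a clear-cut decision.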

3.
Psychol Methods ; 2024 Feb 29.
Article in English | MEDLINE | ID: mdl-38421768

ABSTRACT

Ecological momentary assessment (EMA) involves repeated real-time sampling of respondents' current behaviors and experiences. The intensive repeated assessment imposes an increased burden on respondents, rendering EMAs vulnerable to respondent noncompliance and/or careless and insufficient effort responding (C/IER). We developed a mixture modeling approach that equips researchers with a tool for (a) gauging the degree of C/IER contamination of their EMA data and (b) studying the trajectory of C/IER across the study. To separate attentive from C/IER behavior, the approach leverages collateral information from screen times, which are routinely recorded in electronically administered EMAs, and translates theoretical considerations on respondents' behavior into component models for attentive and careless screen times as well as for the functional form of C/IER trajectories. We show how a sensible choice of component models (a) allows disentangling short screen times due to C/IER from familiarity effects due to repeated exposure to the same measures, (b) aids in gaining a fine-grained understanding of C/IER trajectories by distinguishing within-day from between-day effects, and (c) allows investigating interindividual differences in attentiveness. The approach shows good parameter recovery when attentive and C/IER screen time distributions are sufficiently separated, and it yields valid conclusions even when the data are uncontaminated. The approach is illustrated on EMA data from the German Socio-Economic Panel innovation sample.
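A minimal sketch of the core mixture idea, not the authors' implementation: a two-component mixture on log screen times separates a short (careless) from a longer (attentive) component, and per-day posterior probabilities give a crude look at the C/IER trajectory. All data below are simulated.

```python
# Two-component mixture on log screen times (illustrative sketch; the paper's
# model additionally handles familiarity effects and trajectory shapes).
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(7)
n = 2000
day = rng.integers(1, 15, size=n)           # study day of each screen
is_cier = rng.random(n) < 0.10              # toy truth: 10% contamination
log_t = np.where(is_cier,
                 rng.normal(0.5, 0.4, n),   # short, careless screen times
                 rng.normal(3.0, 0.6, n))   # longer, attentive screen times

gm = GaussianMixture(n_components=2, random_state=1).fit(log_t.reshape(-1, 1))
careless = int(np.argmin(gm.means_))        # component with shorter times
p_cier = gm.predict_proba(log_t.reshape(-1, 1))[:, careless]

print(f"estimated C/IER contamination: {gm.weights_[careless]:.2%}")
for d in (1, 7, 14):                        # crude trajectory check
    # flat here by construction; real EMA data may show trends across days
    print(d, round(float(p_cier[day == d].mean()), 3))
```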

4.
Behav Res Methods ; 56(2): 804-825, 2024 Feb.
Article in English | MEDLINE | ID: mdl-36867339

ABSTRACT

Careless and insufficient effort responding (C/IER) poses a major threat to the quality of large-scale survey data. Traditional indicator-based procedures for its detection are limited in that they are sensitive only to specific types of C/IER behavior, such as straight lining or rapid responding, rely on arbitrary threshold settings, and do not allow taking the uncertainty of C/IER classification into account. Overcoming these limitations, we develop a two-step screen-time-based weighting procedure for computer-administered surveys. The procedure considers the uncertainty in C/IER identification, is agnostic to the specific types of C/IE response patterns, and can feasibly be integrated with common analysis workflows for large-scale survey data. In Step 1, we draw on mixture modeling to identify subcomponents of log screen time distributions presumably stemming from C/IER. In Step 2, the analysis model of choice is applied to the item response data, with respondents' posterior class probabilities employed to downweight response patterns according to their probability of stemming from C/IER. We illustrate the approach on a sample of more than 400,000 respondents who were administered 48 scales of the PISA 2018 background questionnaire. We gather supporting validity evidence by investigating relationships between C/IER proportions and screen characteristics that entail higher cognitive burden, such as screen position and text length; by relating identified C/IER proportions to other indicators of C/IER; and by investigating rank-order consistency in C/IER behavior across screens. Finally, in a re-analysis of the PISA 2018 background questionnaire data, we investigate the impact of the C/IER adjustments on country-level comparisons.


Subject(s)
Screen Time , Humans , Surveys and Questionnaires , Probability , Uncertainty
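A sketch of Step 2 under stated assumptions: given posterior attentiveness probabilities from a Step-1 screen-time mixture (cf. the sketch for entry 3), each response pattern is downweighted by its probability of reflecting attentive behavior. `responses` and `p_attentive` below are simulated stand-ins, not PISA data.

```python
# Step-2 weighting sketch: posterior attentiveness probabilities as case weights.
import numpy as np

rng = np.random.default_rng(1)
responses = rng.integers(1, 5, size=(1000, 10)).astype(float)  # toy Likert data
p_attentive = rng.beta(8, 2, size=1000)                        # from Step 1

scale_scores = responses.mean(axis=1)
naive_mean = scale_scores.mean()
weighted_mean = np.average(scale_scores, weights=p_attentive)  # C/IER-adjusted
print(f"naive: {naive_mean:.3f}  adjusted: {weighted_mean:.3f}")
```

The same weight vector can be handed to any analysis model that accepts case weights, e.g., weighted likelihood estimation in IRT or SEM software.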
5.
Psychol Methods ; 2023 Dec 21.
Article in English | MEDLINE | ID: mdl-38127572

ABSTRACT

Network psychometrics leverages pairwise Markov random fields to depict conditional dependencies among a set of psychological variables as undirected edge-weighted graphs. Researchers often intend to compare such psychometric networks across subpopulations, and recent methodological advances provide invariance tests of differences in subpopulation networks. What remains missing, though, is an analogue to an effect size measure that quantifies differences in psychometric networks. We address this gap by complementing recent advances for investigating whether psychometric networks differ with an intuitive similarity measure quantifying the extent to which networks differ. To this end, we build on graph-theoretic approaches and propose a similarity measure based on the Frobenius norm of differences in psychometric networks' weighted adjacency matrices. To assess this measure's utility for quantifying differences between psychometric networks, we study how it captures differences in subpopulation network models implied by both latent variable models and Gaussian graphical models. We show that a wide array of network differences translates intuitively into the proposed measure, while the same does not hold true for customary correlation-based comparisons. In a simulation study on finite-sample behavior, we show that the proposed measure yields trustworthy results when population networks differ and sample sizes are sufficiently large, but fails to identify exact similarity when population networks are the same. From these results, we derive a strong recommendation to use the measure only as a complement to a significance test of network similarity. We illustrate potential insights from quantifying psychometric network similarities through cross-country comparisons of human values networks.
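A minimal sketch of the Frobenius-norm comparison described above, using toy adjacency matrices; any normalization or significance testing from the article is omitted here.

```python
# Frobenius norm of the difference of two weighted adjacency matrices:
# 0 only if the networks are identical, larger values = more dissimilar.
import numpy as np

def frobenius_distance(w1: np.ndarray, w2: np.ndarray) -> float:
    """Distance between two psychometric networks' edge-weight matrices."""
    return float(np.linalg.norm(w1 - w2, ord="fro"))

# Toy example: two 3-node networks (symmetric edge-weight matrices).
w_a = np.array([[0.0, 0.3, 0.1], [0.3, 0.0, 0.4], [0.1, 0.4, 0.0]])
w_b = np.array([[0.0, 0.2, 0.1], [0.2, 0.0, 0.5], [0.1, 0.5, 0.0]])
print(frobenius_distance(w_a, w_b))
print(frobenius_distance(w_a, w_a))  # 0.0 for identical networks
```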

6.
Br J Math Stat Psychol ; 76(3): 623-645, 2023 Nov.
Article in English | MEDLINE | ID: mdl-36811176

ABSTRACT

Response time modelling is developing rapidly in the field of psychometrics, and its use is growing in psychology. In most applications, component models for response times are modelled jointly with component models for responses, thereby stabilizing estimation of item response theory model parameters and enabling research on a variety of novel substantive research questions. Bayesian estimation techniques facilitate estimation of response time models. Implementations of these models in standard statistical software, however, are still sparse. In this accessible tutorial, we discuss one of the most common response time models-the lognormal response time model-embedded in the hierarchical framework of van der Linden (2007). We provide detailed guidance on how to specify and estimate this model in a Bayesian hierarchical context. One of the strengths of the presented model is its flexibility, which makes it possible to adapt and extend the model according to researchers' needs and hypotheses on response behaviour. We illustrate this with three recent model extensions: (a) application to non-cognitive data incorporating the distance-difficulty hypothesis, (b) modelling conditional dependencies between response times and responses, and (c) identifying differences in response behaviour via mixture modelling. This tutorial aims to provide a better understanding of the use and utility of response time models, showcases how these models can easily be adapted and extended, and responds to a growing need for these models to answer novel substantive research questions in both non-cognitive and cognitive contexts.


Subject(s)
Software , Reaction Time/physiology , Bayes Theorem , Psychometrics/methods
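For reference, the lognormal response time model at the heart of this tutorial (van der Linden, 2007) can be stated in its standard form (notation below follows common usage):

\[ \log t_{ij} = \beta_j - \tau_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N\!\left(0,\ \alpha_j^{-2}\right), \]

where $t_{ij}$ is the response time of person $i$ on item $j$, $\tau_i$ the person's speed, $\beta_j$ the item's time intensity, and $\alpha_j$ a time-discrimination parameter. In the hierarchical framework, the person parameters $(\theta_i, \tau_i)$ and the item parameters receive joint multivariate normal population distributions, which is what links the response time component to the response component.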
7.
Multivariate Behav Res ; 58(3): 560-579, 2023.
Article in English | MEDLINE | ID: mdl-35294313

ABSTRACT

The bivariate Stable Trait, AutoRegressive Trait, and State (STARTS) model provides a general approach for estimating reciprocal effects between constructs over time. However, previous research has shown that this model is difficult to estimate using the maximum likelihood (ML) method (e.g., nonconvergence). In this article, we introduce a Bayesian approach for estimating the bivariate STARTS model and implement it in the software Stan. We discuss issues of model parameterization and show how appropriate prior distributions for model parameters can be selected. Specifically, we propose the four-parameter beta distribution as a flexible prior distribution for the autoregressive and cross-lagged effects. Using a simulation study, we show that the proposed Bayesian approach provides more accurate estimates than ML estimation in challenging data conditions. An example illustrates how the Bayesian approach can be used to stabilize the parameter estimates of the bivariate STARTS model.


Subject(s)
Software , Bayes Theorem , Monte Carlo Method , Markov Chains , Computer Simulation
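A short sketch of the prior idea: a four-parameter beta distribution places a flexible, bounded prior on autoregressive and cross-lagged effects, here on the stationary interval (-1, 1). SciPy's beta with `loc`/`scale` shifts the standard beta onto that interval; the shape values below are illustrative assumptions, not the paper's choices.

```python
# Four-parameter beta prior on (-1, 1) via scipy's shifted/scaled beta.
import numpy as np
from scipy import stats

a, b = 3.0, 3.0                                  # shapes (symmetric around 0 here)
prior = stats.beta(a, b, loc=-1.0, scale=2.0)    # support: (-1, 1)

grid = np.linspace(-0.99, 0.99, 5)
print(prior.pdf(grid))                           # density over the bounded range
print(prior.rvs(size=3, random_state=1))         # draws usable as initial values
```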
8.
Psychol Methods ; 28(3): 527-557, 2023 Jun.
Article in English | MEDLINE | ID: mdl-34928675

ABSTRACT

Small sample structural equation modeling (SEM) may exhibit serious estimation problems, such as failure to converge, inadmissible solutions, and unstable parameter estimates. A vast literature has compared the performance of different solutions for small sample SEM against unconstrained maximum likelihood (ML) estimation. Less is known, however, about the gains and pitfalls of these solutions relative to each other. We bridge this gap by focusing on three current solutions: constrained ML, Bayesian methods using Markov chain Monte Carlo techniques, and fixed reliability single indicator (SI) approaches. In doing so, we evaluate the potential and boundaries of different parameterizations, constraints, and weakly informative prior distributions for improving the quality of the estimation procedure and stabilizing parameter estimates. The performance of all approaches is compared in a simulation study. Under conditions with low reliabilities, Bayesian methods without additional prior information far outperform both constrained ML and the worst-performing fixed reliability SI approach in terms of accuracy of parameter estimates, and they perform no worse than the best-performing fixed reliability SI approach. Under conditions with high reliabilities, constrained ML shows good performance. Both constrained ML and Bayesian methods exhibit conservative to acceptable Type I error rates. Fixed reliability SI approaches are prone to undercoverage and severe inflation of Type I error rates. Stabilizing effects on Bayesian parameter estimates can be achieved even with mildly incorrect prior information. In an empirical example, we illustrate the practical importance of carefully choosing the method of analysis for small sample SEM.


Subject(s)
Bayes Theorem , Humans , Latent Class Analysis , Reproducibility of Results , Computer Simulation , Monte Carlo Method
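As background (a standard construction, not a quotation from the article): fixed reliability SI approaches replace a factor's indicators by a composite $X$ whose loading is fixed to 1 and whose error variance is fixed from an assumed reliability $\rho$,

\[ X = \eta + \varepsilon, \qquad \operatorname{Var}(\varepsilon) = (1-\rho)\,\hat\sigma_X^2. \]

A misspecified $\rho$ therefore propagates directly into the structural estimates, which is consistent with the undercoverage and Type I error inflation reported above.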
9.
Behav Res Methods ; 55(3): 1392-1412, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35650385

ABSTRACT

Early detection of risk of failure on interactive tasks holds great potential for better understanding how examinees differ in their initial behavior and for adaptively tailoring interactive tasks to examinees' competence levels. Drawing on procedures originating in shopper intent prediction on e-commerce platforms, we introduce and showcase a machine learning-based procedure that leverages early-window clickstream data for systematically investigating the early predictability of behavioral outcomes on interactive tasks. We derive features related to the occurrence, frequency, sequentiality, and timing of performed actions from early-window clickstreams and use extreme gradient boosting for classification. Multiple measures are suggested for evaluating the quality and utility of early predictions. We demonstrate the procedure by investigating early predictability of failure on two PIAAC 2012 Problem Solving in Technology Rich Environments (PSTRE) tasks, considering early windows of varying size in terms of both time and actions. We achieved good prediction performance at stages where examinees had, on average, at least two thirds of their solution process ahead of them, and the vast majority of examinees who failed could potentially be detected to be at risk before completing the task. In-depth analyses revealed different features to be indicative of success and failure at different stages of the solution process, thereby highlighting the potential of the applied procedure for gaining a finer-grained understanding of the trajectories of behavioral patterns on interactive tasks.


Subject(s)
Machine Learning , Problem Solving , Humans
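An illustrative sketch of the early-window pipeline: derive simple features from the first k actions of each clickstream and train an extreme gradient boosting classifier to predict failure. Data, feature set, and hyperparameters are toy assumptions; the paper's feature set is richer.

```python
# Early-window clickstream features + XGBoost classification (toy sketch).
import numpy as np
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

def toy_clickstream():
    """Simulated (action, timestamp) sequence plus a failure label."""
    n = int(rng.integers(3, 12))
    times = np.cumsum(rng.exponential(2.0, n))
    actions = [f"a{int(rng.integers(0, 6))}" for _ in range(n)]
    return list(zip(actions, times)), int(rng.random() < 0.4)

def early_features(seq, k=5):
    """Occurrence/frequency/timing features from the first k actions."""
    window = seq[:k]
    times = [t for _, t in window]
    return {
        "n_actions": len(window),
        "n_unique": len({a for a, _ in window}),
        "t_first": times[0],
        "mean_gap": float(np.mean(np.diff(times))) if len(times) > 1 else 0.0,
    }

data = [toy_clickstream() for _ in range(400)]
X = pd.DataFrame([early_features(seq) for seq, _ in data])
y = np.array([failed for _, failed in data])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)
clf = XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.1)
clf.fit(X_tr, y_tr)
# AUC is ~.5 here because the toy labels are random; real clickstreams differ.
print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```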
10.
Educ Psychol Meas ; 82(5): 845-879, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35989730

ABSTRACT

Disengaged item responses pose a threat to the validity of the results provided by large-scale assessments. Several procedures for identifying disengaged responses on the basis of observed response times have been suggested, and item response theory (IRT) models for response engagement have been proposed. We show that response-time-based procedures for classifying response engagement and IRT models for response engagement are based on common ideas, and we propose the distinction between independent and dependent latent class IRT models. In all IRT models considered, response engagement is represented by an item-level latent class variable, but the models assume that response times either reflect or predict engagement. We summarize existing IRT models that belong to each group and extend them to increase their flexibility. Furthermore, we propose a flexible multilevel mixture IRT framework in which all of these IRT models can be estimated by means of marginal maximum likelihood. The framework is based on the widely used Mplus software, making the procedure accessible to a broad audience. The procedures are illustrated on publicly available large-scale data. Our results show that the different IRT models for response engagement provided slightly different adjustments of item parameters and individuals' proficiency estimates relative to a conventional IRT model.
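As a point of orientation, many of these models share an item-level mixture structure of the following generic form (an illustrative sketch, not the exact specification of any one model):

\[ P\!\left(X_{ij}=1\right) = \pi_{ij}\, c_j + \left(1-\pi_{ij}\right) \frac{\exp(\theta_i - b_j)}{1+\exp(\theta_i - b_j)}, \]

where $\pi_{ij}$ is the probability that response $X_{ij}$ was given in a disengaged state, $c_j$ a guessing probability under disengagement, and $\theta_i$, $b_j$ the usual proficiency and difficulty parameters. Independent and dependent latent class variants then differ in whether response times reflect the class (enter as indicators) or predict it (enter as covariates of $\pi_{ij}$).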

11.
Br J Math Stat Psychol ; 75(3): 668-698, 2022 Nov.
Article in English | MEDLINE | ID: mdl-35730351

ABSTRACT

Careless and insufficient effort responding (C/IER) on self-report measures results in responses that do not reflect the trait to be measured, thereby posing a major threat to the quality of survey data. Reliable approaches for detecting C/IER aid in increasing the validity of inferences made from survey data. First, once detected, C/IER can be taken into account in data analysis. Second, approaches for detecting C/IER support a better understanding of its occurrence, which facilitates designing surveys that curb the prevalence of C/IER. Previous approaches for detecting C/IER are limited in that they identify C/IER at the aggregate respondent or scale level, thereby hindering investigations of the item characteristics that evoke C/IER. We propose an explanatory mixture item response theory model that supports identifying and modelling C/IER at the respondent-by-item level, can detect a wide array of C/IER patterns, and facilitates a deeper understanding of the item characteristics associated with its occurrence. Because the approach requires only raw response data, it is applicable to data from both paper-and-pencil and online surveys. The model shows good parameter recovery and handles the simultaneous occurrence of multiple types of C/IER patterns in simulated data well. The approach is illustrated on a publicly available Big Five inventory data set, where we found later item positions to be associated with higher C/IER probabilities. We gathered initial supporting validity evidence for the proposed approach by investigating its agreement with multiple commonly employed indicators of C/IER.


Subject(s)
Self Report , Humans , Surveys and Questionnaires
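The explanatory part can be pictured as regressing the item-level C/IER probability on item characteristics; an illustrative form consistent with the reported position effect is

\[ \operatorname{logit}\,\pi_{ij} = \gamma_{0i} + \gamma_1\,\mathrm{position}_j, \]

with $\gamma_1 > 0$ corresponding to higher C/IER probabilities for later item positions.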
13.
Psychometrika ; 87(2): 593-619, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34855118

ABSTRACT

Careless and insufficient effort responding (C/IER) can pose a major threat to data quality and, as such, to the validity of inferences drawn from questionnaire data. A rich body of methods aiming at its detection has been developed. Most of these methods can detect only specific types of C/IER patterns; typically, however, different types of C/IER patterns occur within one data set and need to be accounted for. We present a model-based approach for detecting manifold manifestations of C/IER at once. This is achieved by leveraging response time (RT) information available from computer-administered questionnaires and integrating theoretical considerations on C/IER with recent psychometric modeling approaches. The approach (a) takes the specifics of attentive response behavior on questionnaires into account by incorporating the distance-difficulty hypothesis, (b) allows attentiveness to vary at the screen-by-respondent level, (c) allows respondents with different trait and speed levels to differ in their attentiveness, and (d) deals with various response patterns arising from C/IER at once. The approach makes use of item-level RTs. An adapted version for aggregated RTs is presented that supports screening for C/IER behavior at the respondent level. Parameter recovery is investigated in a simulation study. The approach is illustrated in an empirical example, comparing different RT measures and contrasting the proposed model-based procedure against indicator-based multiple-hurdle approaches.


Subject(s)
Psychometrics , Computer Simulation , Psychometrics/methods , Reaction Time , Self Report , Surveys and Questionnaires
14.
Front Psychol ; 12: 604526, 2021.
Article in English | MEDLINE | ID: mdl-34276461

ABSTRACT

Previous research suggests that parental attachment is stable throughout emerging adulthood. However, the relationships between the mutual attachments in dyads of emerging adults and their parents are still unclear. Our study examines stability and change in dyadic attachment. We asked 574 emerging adults and 463 parents about their mutual attachments on four occasions over one year. We used a latent state-trait model with autoregressive effects to estimate the time consistency of the attachments. Attachment was very stable, and earlier measurement occasions could explain more than 60% of the reliable variance. Changes in attachment over time showed an accumulation of situational effects for emerging adults but not for their parents. We estimated the correlations of the mutual attachments over time using a novel multi-rater latent state-trait model with autoregressive effects. This model showed that the mutual attachments of parents and emerging adults were moderately to highly correlated. Our model allows us to separate stable attachment from changing attachment. The correlations between the mutual attachments were higher for the stable elements of attachment than for the changing elements. Emerging adults and their parents share a stable mutual attachment, but they do not share the changes in their respective attachments.

15.
Front Psychol ; 12: 615162, 2021.
Article in English | MEDLINE | ID: mdl-33995176

ABSTRACT

With small to modest sample sizes and complex models, maximum likelihood (ML) estimation of confirmatory factor analysis (CFA) models can show serious estimation problems such as non-convergence or parameter estimates outside the admissible parameter space. In this article, we distinguish different Bayesian estimators that can be used to stabilize the parameter estimates of a CFA: the mode of the joint posterior distribution that is obtained from penalized maximum likelihood (PML) estimation, and the mean (EAP), median (Med), or mode (MAP) of the marginal posterior distribution that are calculated by using Markov chain Monte Carlo (MCMC) methods. In two simulation studies, we evaluated the performance of the Bayesian estimators from a frequentist point of view. The results show that the EAP produced more accurate estimates of the latent correlation in many conditions and outperformed the other Bayesian estimators in terms of root mean squared error (RMSE). We also argue that it is often advantageous to choose a parameterization in which the main parameters of interest are bounded, and we suggest the four-parameter beta distribution as a prior distribution for loadings and correlations. Using simulated data, we show that selecting weakly informative four-parameter beta priors can further stabilize parameter estimates, even when the priors are mildly misspecified. Finally, we derive recommendations and propose directions for further research.
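For reference, the four-parameter beta distribution suggested here as a prior has the standard density on a bounded interval $[l, u]$,

\[ p(x \mid a, b, l, u) = \frac{(x-l)^{a-1}\,(u-x)^{b-1}}{B(a,b)\,(u-l)^{a+b-1}}, \qquad l \le x \le u, \]

which reduces to the usual beta distribution for $l = 0$, $u = 1$. For a loading or correlation, one would choose bounds matching the admissible range (e.g., $l = -1$, $u = 1$ for a correlation), with $a$ and $b$ set so that the prior is weakly informative.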

16.
Science ; 372(6540): 338-340, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33888624
17.
Psychometrika ; 86(1): 190-214, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33544300

ABSTRACT

Complex interactive test items are becoming more widely used in assessments. Being computer-administered, assessments using interactive items allow logging time-stamped action sequences. These sequences provide a rich source of information that may facilitate investigating how examinees approach an item and arrive at their given response. There is a rich body of research leveraging action sequence data for investigating examinees' behavior. However, the associated timing data have been considered mainly at the item level, if at all. Considering timing data at the action level in addition to action sequences, however, has vast potential to support a more fine-grained assessment of examinees' behavior. We provide an approach that jointly considers action sequences and action-level times for identifying common response processes. In doing so, we integrate tools from clickstream analyses and graph-modeled data clustering with psychometrics. In our approach, we (a) provide similarity measures that are based on both actions and the associated action-level timing data and (b) subsequently employ cluster edge deletion for identifying homogeneous, interpretable, well-separated groups of action patterns, each describing a common response process. Guidelines on how to apply the approach are provided. The approach and its utility are illustrated on a complex problem-solving item from PIAAC 2012.


Subject(s)
Computers , Problem Solving , Cluster Analysis , Psychometrics
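One possible instantiation of an action-and-timing similarity (an illustrative assumption, not the authors' exact measure): blend a normalized sequence similarity on actions with a penalty for differences in time on task.

```python
# Toy similarity combining action-sequence overlap with a timing term.
from difflib import SequenceMatcher
import numpy as np

def pattern_similarity(seq1, seq2, alpha: float = 0.5) -> float:
    """seq1/seq2: lists of (action, duration) pairs; returns a value in [0, 1]."""
    a1, t1 = [a for a, _ in seq1], np.array([t for _, t in seq1])
    a2, t2 = [a for a, _ in seq2], np.array([t for _, t in seq2])
    action_sim = SequenceMatcher(None, a1, a2).ratio()  # order-aware overlap
    # Timing term: compare total time on task on a log scale, clipped at 0.
    time_sim = 1.0 - abs(np.log(t1.sum()) - np.log(t2.sum())) \
        / np.log(1.0 + max(t1.sum(), t2.sum()))
    return alpha * action_sim + (1 - alpha) * max(float(time_sim), 0.0)

p1 = [("start", 2.0), ("menu", 1.5), ("submit", 3.0)]
p2 = [("start", 2.5), ("submit", 2.0)]
print(pattern_similarity(p1, p2))
```

A pairwise matrix of such similarities could then feed a graph clustering step such as cluster edge deletion, as described above.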
18.
Educ Psychol Meas ; 80(3): 522-547, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32425218

ABSTRACT

So far, modeling approaches for not-reached items have considered a single underlying process. However, missing values at the end of a test can occur for a variety of reasons. On the one hand, examinees may not reach the end of a test due to time limits and lack of working speed. On the other hand, examinees may not attempt all items and quit responding due to, for example, fatigue or lack of motivation. We use response times retrieved from computerized testing to distinguish missing data due to lack of speed from missingness due to quitting. On the basis of this information, we present a new model that makes it possible to disentangle and simultaneously model the different missing data mechanisms underlying not-reached items. The model (a) supports a more fine-grained understanding of the processes underlying not-reached items and (b) makes it possible to disentangle different sources describing test performance. In a simulation study, we evaluate estimation of the proposed model. In an empirical study, we show what insights can be gained regarding test-taking behavior using this model.
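A heuristic illustration of the timing intuition, not the paper's model (which treats the mechanisms probabilistically rather than with a hard cutoff); the threshold and inputs are assumptions.

```python
# Crude rule of thumb: an examinee who used up (nearly) the whole testing time
# presumably lacked speed; one who stopped with ample time left presumably quit.
def classify_not_reached(total_time_used: float,
                         time_limit: float,
                         threshold: float = 0.95) -> str:
    """Label the likely mechanism behind an examinee's not-reached items."""
    if total_time_used >= threshold * time_limit:
        return "lack of speed"   # ran out of time while still working
    return "quitting"            # stopped responding with time remaining

print(classify_not_reached(total_time_used=58.5, time_limit=60.0))  # lack of speed
print(classify_not_reached(total_time_used=31.0, time_limit=60.0))  # quitting
```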

19.
Multivariate Behav Res ; 55(3): 425-453, 2020.
Article in English | MEDLINE | ID: mdl-31448968

ABSTRACT

For adequate modeling of missing responses, a thorough understanding of the nonresponse mechanisms is vital. As many major testing programs have moved or are moving to computer-based assessment, a rich body of additional data on examinee behavior becomes easily accessible. These additional data may contain valuable information on the processes associated with nonresponse. Bringing together research on item omissions with approaches for modeling response time data, we propose a framework for simultaneously modeling response behavior and omission behavior, utilizing timing information for both. The proposed model makes it possible (a) to gain a deeper understanding of response and nonresponse behavior in general and, in particular, of the processes underlying item omissions in large-scale assessments, (b) to model the processes determining the time examinees require to generate a response or to omit an item, and (c) to account for nonignorable item omissions. Parameter recovery of the proposed model is studied in a simulation study. An illustration of the model by means of an application to real data is provided.


Subject(s)
Algorithms , Computer Simulation , Models, Statistical , Reaction Time/physiology , Data Interpretation, Statistical , Humans
20.
Br J Math Stat Psychol ; 73 Suppl 1: 83-112, 2020 Nov.
Article in English | MEDLINE | ID: mdl-31709521

ABSTRACT

In low-stakes assessments, test performance has few or no consequences for examinees themselves, so examinees may not be fully engaged when answering the items. Instead of engaging in solution behaviour, disengaged examinees might randomly guess or generate no response at all. When ignored, examinee disengagement poses a severe threat to the validity of results obtained from low-stakes assessments. Statistical modelling approaches in educational measurement have been proposed that account for non-response or for guessing, but they do not consider both types of disengaged behaviour simultaneously. We bring together research on modelling examinee engagement and research on missing values and present a hierarchical latent response model for identifying and modelling the processes associated with examinee disengagement jointly with the processes associated with engaged responses. To that end, we employ a mixture model that identifies disengagement at the item-by-examinee level by assuming different data-generating processes underlying item responses and omissions, respectively, as well as different response times associated with engaged and disengaged behaviour. By modelling examinee engagement within a latent response framework, the model allows assessing how examinee engagement relates to ability and speed as well as identifying items that are likely to evoke disengaged test-taking behaviour. An illustration of the model by means of an application to real data is presented.


Subject(s)
Educational Measurement/statistics & numerical data , Models, Psychological , Models, Statistical , Test Taking Skills/psychology , Test Taking Skills/statistics & numerical data , Bayes Theorem , Choice Behavior , Computer Simulation , Data Interpretation, Statistical , Decision Making , Humans , Markov Chains , Monte Carlo Method , Motivation , Reaction Time